PhyStat2003, SLAC, September 8-11 



Challenges in Moving the LEP Higgs Statistics to the LHC 

K.S. Cranmer, B. Mellado, W. Quayle, Sau Lan Wu 
University of Wisconsin-Madison, Madison, WI 53706, USA



We examine computational, conceptual, and philosophical issues in moving the statistical techniques used in 
the LEP Higgs working group to the LHC. 

1. Introduction

Higgs searches at LEP were based on marginal sig- 
nal expectations and small background uncertainties. 
In contrast, Higgs searches at the LHC are based on 
strong signal expectations and relatively large back- 
ground uncertainties. Based on our experience with 
the LEP Higgs search, our group tried to move the 
tools we had developed at LEP to the LHC environ- 
ment. In particular, our calculation of confidence lev- 
els was based on an analytic computation with the 
Fast Fourier Transform and the log-likelihood ratio as 
a test statistic (and systematic errors based on the 
Cousins-Highland approach). We encountered three 
types of problems when calculating ATLAS' com- 
bined sensitivity to the Standard Model Higgs Boson: 
problems associated with large numbers of expected 
events, problems arising from very high significance 
levels, and problems related to the incorporation of 
systematic errors. 

Previously, it was shown that the migration of the 
statistical techniques that were used in the LEP Higgs 
Working Group to the LHC environment is not as 
straightforward as one might naively expect [1]. After
a brief overview in Section 2, those difficulties and
their ultimate solution are discussed in Section 3. Our
group has developed two independent software solutions
(both in C++; both with FORTRAN bindings; one
ROOT based and the other standalone) which can be
found at:

http://wisconsin.cern.ch/software

In Section 4 we discuss the incorporation of systematic
errors and compare a few different strategies.
In Section 5 we present and discuss the discovery
luminosity (the luminosity expected to be required for
discovery). Lastly, in Section 6 we discuss the statistical
notion of power, which is related to the probability
of Type II error (the probability that we reject the
"signal-plus-background hypothesis" when it is true).



2. The Formalism 

Our starting point for this note is a brief review of
the techniques that were used at LEP. We refer the
interested reader to [?] for an introduction to the
fundamentals, to [?] for why the likelihood ratio has been
chosen as a test statistic, to [?] for a Monte Carlo
approach to the calculation and to [?] for the analytic
calculation using Fast Fourier Transform (FFT)
techniques. For completeness, we introduce the basic
approach below using the notation found in [?].
For a counting experiment where we expect, on aver- 
age, b background events and s signal events, we con- 
sider two hypotheses: the null (or background-only) 
hypothesis in which the number of expected events, 
n, is described by a Poisson distribution P(n; b) and
the alternate (or signal-plus-background) hypothesis
in which the number of expected events is described
by a Poisson distribution P(n; s + b). Here the number
of events serves the purpose of a test statistic: a real
number which quantifies an experiment.

It is possible to include a discriminating variable
x which has some probability density function (pdf)
for the background, $f_b(x)$, and some pdf for the
signal, $f_s(x)$, both normalized to unity. Given an
observation at x we can construct the likelihood ratio
$Q = (s f_s(x) + b f_b(x)) / (b f_b(x))$. With several
independent observations $\{x_i\}$ we can consider the
combined likelihood ratio $Q = \prod_i Q_i$. It is possible,
and in some sense optimal, to use Q (or in practice
$q = \ln Q$) as a test statistic.
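
As an illustration of the per-observation formula above, the following
minimal sketch evaluates $q = \ln Q$ for a handful of observations. The
pdf shapes (a Gaussian signal peak on a uniform background), the event
yields, and all function names are hypothetical choices made for this
example; they are not taken from the analyses or the software discussed
in this note.

```python
import math


def gaussian_pdf(x, mean, width):
    """Unit-normalized Gaussian density (hypothetical signal shape)."""
    norm = width * math.sqrt(2.0 * math.pi)
    return math.exp(-0.5 * ((x - mean) / width) ** 2) / norm


def uniform_pdf(x, lo, hi):
    """Unit-normalized flat density on [lo, hi] (hypothetical background shape)."""
    return 1.0 / (hi - lo) if lo <= x <= hi else 0.0


def log_likelihood_ratio(observations, s, b, f_s, f_b):
    """q = ln Q, with Q the product of (s f_s + b f_b) / (b f_b) over observations."""
    q = 0.0
    for x in observations:
        q += math.log((s * f_s(x) + b * f_b(x)) / (b * f_b(x)))
    return q


def signal_shape(x):
    return gaussian_pdf(x, mean=120.0, width=2.0)


def background_shape(x):
    return uniform_pdf(x, lo=100.0, hi=150.0)


# Two observations near the assumed signal peak, two far away from it.
q_peak = log_likelihood_ratio([119.5, 120.3], 10.0, 100.0, signal_shape, background_shape)
q_side = log_likelihood_ratio([105.0, 141.0], 10.0, 100.0, signal_shape, background_shape)
print(q_peak, q_side)
```

Observations near the assumed peak contribute large positive terms to q,
while observations far from it contribute terms close to zero.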

The computational challenge of using the log-likelihood
ratio in conjunction with a discriminating
variable x is the construction of the log-likelihood
ratio distribution for the background-only hypothesis,
$\rho_b(q)$, and for the signal-plus-background hypothesis,
$\rho_{s+b}(q)$. In this case, there are not only the Poisson
fluctuations of the number of events, but also the
continuously varying discriminating variable x. In
particular, for a single background event the log-likelihood
ratio distribution, $\rho_{1,b}(q)$, must incorporate all
possible values of x. From these single event distributions
we can build up the expected log-likelihood ratio
distribution by repeated convolution. This is most
effectively done by using a Fast Fourier Transform (FFT)
where convolution can be expressed as multiplication
in the frequency domain (denoted with a bar). In
particular we arrive at:



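\[ \overline{\rho_b(q)} = e^{\,b\,[\overline{\rho_{1,b}(q)} - 1]} \quad \text{and} \quad \overline{\rho_{s+b}(q)} = e^{\,(s+b)\,[\overline{\rho_{1,s+b}(q)} - 1]} \tag{1} \]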
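
Eq. 1 is the Fourier-domain form of a Poisson-weighted sum of repeated
convolutions, and it can be checked on a small toy. The sketch below is
illustrative only: it uses a naive discrete Fourier transform written
with the Python standard library rather than an FFT library, and the
helper names and the two-bin single-event distribution are assumptions
made for this example, not part of the software packages mentioned in
the Introduction.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(N^2) discrete Fourier transform (stand-in for an FFT library)."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    result = []
    for k in range(n):
        total = 0j
        for m in range(n):
            angle = sign * 2.0 * math.pi * k * m / n
            total += values[m] * cmath.exp(1j * angle)
        result.append(total / n if inverse else total)
    return result


def compound_poisson(single_event, mean_events):
    """Apply Eq. 1: exponentiate mean * (transform - 1), then transform back."""
    spectrum = dft(single_event)
    combined = [cmath.exp(mean_events * (z - 1.0)) for z in spectrum]
    return [z.real for z in dft(combined, inverse=True)]


# Hypothetical single-event distribution: q falls in bin 1 or bin 2
# with equal probability. The array is indexed by bins of q.
n_bins = 64
single = [0.0] * n_bins
single[1] = 0.5
single[2] = 0.5

rho_b = compound_poisson(single, mean_events=2.0)
print(sum(rho_b), rho_b[0], math.exp(-2.0))
```

Because the discrete transform implements circular convolution, the
array must be long enough that the probability of reaching the last
bins is negligible; otherwise the upper tail wraps around into the
low bins.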



From the log-likelihood distribution of the two
hypotheses we can calculate a number of useful
quantities. Given some experiment with an observed
log-likelihood ratio, $q^*$, we can calculate the
background-only confidence level, $CL_b$:

\[ CL_b(q^*) = \int_{q^*}^{\infty} \rho_b(q')\, dq' \tag{2} \]



In the absence of an observation we can calculate the
expected $CL_b$ given that the signal-plus-background
hypothesis is true. To do this we first must find the
median of the signal-plus-background distribution, $q_{s+b}$.
From this we can calculate the expected $CL_b$ by
using Eq. 2 evaluated at $q^* = q_{s+b}$.

Finally, we can convert the expected background
confidence level into an expected Gaussian significance,
$N\sigma$, by finding the value of N which satisfies

\[ CL_b(q_{s+b}) = \frac{1 - \mathrm{erf}(N/\sqrt{2})}{2} \tag{3} \]

where $\mathrm{erf}(N) = (2/\sqrt{\pi}) \int_0^N \exp(-y^2)\, dy$ is a function
readily available in most numerical libraries.
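
The conversion in Eq. 3 can be sketched numerically as follows. This is
a minimal illustration using the Python standard library; the function
names and the bisection bounds are assumptions of the example. It uses
the complementary error function, erfc(x) = 1 - erf(x), so that very
small tail probabilities are not lost to cancellation.

```python
import math


def tail_probability(n_sigma):
    """Right-hand side of Eq. 3: (1 - erf(N / sqrt(2))) / 2, written with erfc."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))


def significance_from_clb(cl_b, lo=0.0, hi=37.0):
    """Solve Eq. 3 for N by bisection (tail_probability decreases with N)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if tail_probability(mid) > cl_b:
            lo = mid  # tail still too large: N must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


print(tail_probability(5.0))
print(significance_from_clb(tail_probability(5.0)))
```

The bisection is only one possible choice; any root finder or an
inverse error function from a numerical library would serve equally well.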



3. Numerical Difficulties 

The methods described in the previous section have
been applied to the combined ATLAS Higgs effort
with some caveats related to numerical difficulties [?].
In particular, in the extreme tails of $\rho_b(q)$, the
probability density is dominated by numerical noise. This
numerical noise is an artifact of round-off error in
the double precision numbers used in the Fast Fourier
Transform^1. The noise is on the order of $10^{-17}$ (for
double precision floating point numbers), which translates
into a limit on the significance of about $8\sigma$. For
particular values of the Higgs mass, ATLAS has an
expected significance well above $8\sigma$ with only 10 fb$^{-1}$
of data. In order to produce significance values above
the $8\sigma$ limit, various extrapolation methods were used
in [?]. We now introduce a definitive solution to this
problem based on arbitrary precision floating point
numbers.

It should be made clear that the numerical precision
problem is not due to the fact that the $CL_b$ is so small
that the evaluation of the integral in Eq. 2 cannot be
treated with double precision floating point numbers.
Instead, the numerical precision problem is due to the
many (approximately $2^{20}$) Fourier modes which must
in total produce a number very close to 0. In order
to rectify this problem we have implemented the Fast
Fourier Transform with the arbitrary-precision floating
point numbers provided in the CLN library^2 [?].

^1 We use the FFTW library: http://www.fftw.org
^2 CLN is available at http://www.ginac.de



Figure 1: The distribution of the log-likelihood ratio $\rho(q)$
for the null and alternate hypotheses (the axis labels refer
to bins of q, not q itself). For q > 10^ the distribution is 
contaminated by numerical noise (see text for details). 



One might protest that above $5\sigma$ we are not interested
in the precise value of the significance and that this
exercise is purely academic. We refer the interested
reader to Sections 5 & 6 for different summaries of
an experiment's discovery potential.
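
The round-off limitation discussed in this section can be illustrated
with a toy that has nothing to do with the FFT itself. In the sketch
below, Python's `decimal` module is used as a stand-in for an
arbitrary-precision library such as CLN; the function names and the
choice of 50 digits are assumptions of the example. It shows that the
tail probability corresponding to $8\sigma$ (Eq. 3) is comparable to the
double precision machine epsilon, and that a small term added to a
number of order one is lost in double precision but retained with more
digits.

```python
import math
import sys
from decimal import Decimal, localcontext


def one_sided_tail(n_sigma):
    """Gaussian tail probability for n_sigma, as in Eq. 3."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))


def residual_in_double(tiny):
    """Add a tiny term to 1 and subtract 1 again in double precision."""
    return (1.0 + tiny) - 1.0


def residual_in_decimal(tiny, digits=50):
    """Same operation carried out with `digits` significant decimal digits."""
    with localcontext() as ctx:
        ctx.prec = digits
        return (Decimal(1) + Decimal(tiny)) - Decimal(1)


print(one_sided_tail(8.0), sys.float_info.epsilon)
print(residual_in_double(1e-20))       # the tiny term is lost
print(residual_in_decimal("1e-20"))    # the tiny term survives
```

The price of the extra digits is speed, since arbitrary-precision
arithmetic is carried out in software rather than in hardware.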



3.1. Extrapolation 

While the arbitrary precision FFT approach is the
definitive solution to the problem of calculating very
high expected significance, it is also incredibly time
consuming. A much faster, approximate solution is
to estimate the $CL_b$ by fitting the $\rho_b$ distribution
to a functional form. The first method of extrapolation
studied was a simple Gaussian fit to the $\rho_b$
distribution. This method works fairly well, but tends
to overestimate the significance. The second method
we studied was based on a Poisson fit to the $\rho_b$
distribution. The Poisson distribution has the desirable
properties that it will have no probability below the
hard limit $q > -s$ and that its shape is more
appropriate. Figure 2 compares these different
extrapolation methods.
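
The tendency of a Gaussian shape to overestimate the significance can
be seen in a toy counting experiment, where the background-only
distribution is exactly Poisson. The sketch below is not the fit to
$\rho_b(q)$ described above; it simply compares the far tail of a Poisson
distribution with that of a Gaussian of the same mean and variance. The
background expectation of 100 events and the threshold of 150 events
are hypothetical values chosen for the example.

```python
import math


def poisson_tail(k_min, mean, n_terms=2000):
    """P(n >= k_min) for a Poisson distribution, summed term by term."""
    total = 0.0
    for k in range(k_min, k_min + n_terms):
        total += math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))
    return total


def gaussian_tail(k_min, mean):
    """Upper tail of a Gaussian with the same mean and variance as the Poisson."""
    z = (k_min - mean) / math.sqrt(mean)
    return 0.5 * math.erfc(z / math.sqrt(2.0))


background = 100.0
threshold = 150

p_gauss = gaussian_tail(threshold, background)
p_poisson = poisson_tail(threshold, background)
print(p_gauss, p_poisson)
```

The Gaussian tail probability is the smaller of the two, so converting
it to a significance gives a larger value than the Poisson tail does.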



4. Incorporating Systematic Uncertainty 

One encounters both philosophical and technical
difficulties when one tries to incorporate uncertainty
on the predicted values s and b found in Eq. 1. In a
Frequentist formalism the unknown s and b become
nuisance parameters. In a Bayesian formalism, s and
b can be marginalized by integration over their respective
priors. At LEP the practice was to smear $\rho_b$ and
$\rho_{s+b}$ by integrating s and b with a multivariate normal
distribution as a prior. This smearing technique is
commonly referred to as the Cousins-Highland technique,
and it has some Bayesian aspects.

Figure 2: Comparison of the ATLAS Higgs combined
significance obtained from several approximate
techniques. The (red) dashed line corresponds to the
unmodified likelihood ratio which cannot produce
significance values above about $8\sigma$ (see text). This figure
is meant to demonstrate the different methods of
combination and does not include up-to-date results from
the various Higgs analyses.

4.1. A Purely Frequentist Technique

At the PhyStat2003 conference a purely frequentist
approach to hypothesis testing with background
uncertainty was presented [?]. This method relies on the
full Neyman construction and uses a likelihood ratio
similar to the profile method as an ordering rule. In
this formalism, a systematic uncertainty at the level
of 10% has a much larger effect than when treated
with the Cousins-Highland technique.

4.2. The Cousins-Highland Technique

The Cousins-Highland formalism for including systematic
errors on the normalization of the signal and
background is provided in [?] and generalized in [?].
In particular, for a multivariate normal distribution^3
as a prior for the $n_i$, the distribution of the
log-likelihood ratio is given by:

i 



^3 In principle, any distribution could be used within this
framework.



where $S_{ij} = \langle (n_i - \langle n_i \rangle)(n_j - \langle n_j \rangle) \rangle$. Reference [?]
provides an analytic expression for the resulting
log-likelihood ratio distribution including a correlated
error matrix; however, this equation was obtained with
an integration over negative numbers of expected
events and does not hold. Attempts to provide a
closed form solution for the positive semi-definite
region require analytical continuation of the error
function over a wide range of the complex plane. Instead,
a numerical integration over the positive semi-definite
region has been adopted for our software packages.
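
The idea of a numerical integration restricted to non-negative
expectations can be sketched in one dimension. The example below is a
simplified stand-in, not the implementation in the software packages: it
smears a Poisson counting distribution over a single uncertain
background expectation with a Gaussian prior truncated at zero, using a
midpoint rule. The function names, the integration range of six
standard deviations, the number of steps, and the values b = 100 with a
10% uncertainty are all assumptions of the example.

```python
import math


def poisson_pmf(n, mean):
    """Poisson probability of observing n events for a positive mean."""
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))


def smeared_poisson(n, b, sigma_b, steps=200, width=6.0):
    """Average P(n; b') over a Gaussian prior on b', restricted to b' >= 0."""
    lo = max(0.0, b - width * sigma_b)   # never integrate over negative b'
    hi = b + width * sigma_b
    h = (hi - lo) / steps
    weighted = 0.0
    norm = 0.0
    for i in range(steps):
        b_prime = lo + (i + 0.5) * h     # midpoint rule, so b_prime > 0
        weight = math.exp(-0.5 * ((b_prime - b) / sigma_b) ** 2)
        weighted += weight * poisson_pmf(n, b_prime)
        norm += weight                   # renormalize the truncated prior
    return weighted / norm


probs = [smeared_poisson(n, b=100.0, sigma_b=10.0) for n in range(300)]
mean = sum(n * p for n, p in enumerate(probs))
variance = sum((n - mean) ** 2 * p for n, p in enumerate(probs))
print(sum(probs), mean, variance)
```

When the truncation is negligible, the variance of the smeared
distribution is close to $b + \sigma_b^2$, the sum of the Poisson variance
and the variance of the prior.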



5. Discovery Luminosity 

Because the calculation of expected significance is
technically very difficult at the LHC, other summaries
of the discovery potential have been explored. While
these techniques are not new, it is important to
consider their pros and cons. One such alternate
summary of the discovery potential is based on the
"discovery luminosity". Define the discovery luminosity,
$L^*(m_H)$, to be the integrated luminosity necessary for
the expected significance to reach $5\sigma$. The discovery
luminosity is an informative quantity; however, it
must be interpreted with some care:

• Collecting an integrated luminosity equal to the
nominal discovery luminosity does not guarantee
that a discovery will be made. Instead, with
$L^*(m_H)$ of data the median of $\rho_{s+b}$ will be at the
$5\sigma$ level, which corresponds to a 50% chance of
discovery. See Section 6 for more details.

• In practice an analysis' cuts, systematic er- 
ror, and signal and background efficiencies are 
luminosity-dependent quantities. When we cal- 
culate the discovery luminosity, we treat the 
analysis as constant. 
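
Both caveats can be made concrete with a toy counting experiment. In
the sketch below the expected significance is approximated by
$s/\sqrt{b}$, which is a deliberate simplification and not the
likelihood-ratio calculation of Section 2; the signal and background
rates per unit luminosity are hypothetical and are held constant, in
line with the second caveat. The example finds the luminosity at which
the approximate significance reaches $5\sigma$ and then evaluates the
probability of actually exceeding the $5\sigma$ threshold.

```python
import math


def poisson_pmf(n, mean):
    """Poisson probability of observing n events for a positive mean."""
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))


def expected_significance(lumi, s_rate, b_rate):
    """Approximate significance s / sqrt(b) at integrated luminosity lumi."""
    return s_rate * lumi / math.sqrt(b_rate * lumi)


def discovery_luminosity(s_rate, b_rate, target=5.0, lo=1e-6, hi=1e6):
    """Bisect (on a logarithmic scale) for the luminosity reaching the target."""
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if expected_significance(mid, s_rate, b_rate) < target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def discovery_probability(lumi, s_rate, b_rate, target=5.0):
    """P(n >= threshold) under signal-plus-background, with the threshold
    set at target standard deviations above the background expectation."""
    b = b_rate * lumi
    n_crit = math.ceil(b + target * math.sqrt(b))
    mean = (s_rate + b_rate) * lumi
    return 1.0 - sum(poisson_pmf(n, mean) for n in range(n_crit))


# Hypothetical rates in events per unit of integrated luminosity.
s_rate, b_rate = 5.0, 10.0
lumi_star = discovery_luminosity(s_rate, b_rate)
print(lumi_star)
print(discovery_probability(lumi_star, s_rate, b_rate))
print(discovery_probability(2.0 * lumi_star, s_rate, b_rate))
```

At the discovery luminosity the probability of claiming a discovery is
close to one half, and it rises as more luminosity is collected.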



6. The Power of a $5\sigma$ Test

The traditional quantity which is used to summarize
an experiment's discovery potential is the combined
significance; however, as was noted in Section 3, this
plot becomes very difficult to make when the significance
goes beyond about $8\sigma$. Furthermore, the plot
itself starts to lose relevance when the significance
is far above $5\sigma$. The discovery luminosity is another
possible way of illustrating an experiment's discovery
potential, but it must be interpreted with some
care. A third summary of an experiment's discovery
potential is related to the probability of Type
II error: the power. First, it should be noted that the
expected significance is a measure of separation between
the medians of the background-only and
signal-plus-background hypotheses. Thus, when we see the
significance curve cross the $5\sigma$ line in Fig. 2 there is
only a 50% chance that we would observe a $5\sigma$ effect if
the Higgs does indeed exist at that mass. In practice,
we claim a discovery if the observed data exceeds the
$5\sigma$ critical region, and do not claim a discovery if it
doesn't. The $5\sigma$ discovery threshold
is a convention which sets the probability of Type I
error to be $2.85 \cdot 10^{-7}$. With that in mind, the idea
that the significance is $20\sigma$ at $m_H = 160$ GeV is
irrelevant. What is relevant is the probability that we will
claim discovery of the Higgs if it is indeed there: that
quantity is called the power. The power is defined as
$1 - \beta$ where $\beta$ is the probability of Type II error: the
probability that we reject the signal-plus-background
hypothesis when it is true [?].

Figure 3: Examples of power for two different
signal-plus-background hypotheses with respect to a
single background-only hypothesis with 100 expected
events (black).

Consider Figure 3 with a background expectation of
100 events. The black vertical arrow denotes the $5\sigma$
discovery threshold. The (red) dashed curve shows
the distribution of the number of expected events for
a signal-plus-background hypothesis with 150 events.
Normally, we would say the expected significance is
$5\sigma$ for this hypothesis; however, we can see that only
50% of the time we would actually claim discovery.
The rightmost (blue) curve shows the distribution
of the number of expected events for a
signal-plus-background hypothesis with 180 events. Normally, we
would say the expected significance is $8\sigma$ for this
hypothesis; however, a more meaningful quantity, the
power, is associated with the probability we would
claim discovery, which is about 98%. In addition to
the power being a germane quantity, it is much easier
to calculate.



7. Conclusion

In conclusion, the migration of the statistical
toolset developed at LEP to the LHC environment is not
as straightforward as one might expect. The first
difficulties are computational and arise from the
combination of channels with many events and channels with
few events (these are easily solved). The next
difficulties are numerical and arise from the extremely
high expected significance of the high-energy
frontier. These problems can be solved by brute force;
or they can be reinterpreted as conceptual problems,
and solved by asking different questions (i.e. power).
Lastly, there is a philosophical split related to the
Bayesian and Frequentist approach to uncertainty. At
the LHC, the choice of the formalism is no longer a
second-order effect, and this problem is not so easy to
solve.